Improving Performance by Re-Rating in the Dynamic Estimation of Rater Reliability

Authors

Alexey Tarasov

Sarah Jane Delany

Abstract

Crowdsourcing is now widely used in supervised machine learning to collect ratings for unlabelled training sets. To obtain good-quality results, it is worth rejecting ratings from noisy or unreliable raters as soon as they are discovered. Many techniques for filtering unreliable raters work by presenting training instances to the raters identified as most accurate to date. Early in the process, however, the true rater reliabilities are not known, so unreliable raters may be used as a result. This paper explores improving the quality of ratings for training instances by re-rating. Re-rating relies on detecting instances that may have received ratings from unreliable raters and acquiring additional ratings for them once the rating process is over. We evaluate different approaches to re-rating and compare their improvements in labeling accuracy and their labeling costs.
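The re-rating idea in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how instances are detected or how extra ratings are gathered. The threshold rule, the reliability-weighted vote, and every name below (`select_for_rerating`, `rerate`, `weighted_label`, `ask_rater`) are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def select_for_rerating(ratings, reliability, threshold=0.6):
    """Flag instances whose raters have a low mean final reliability.

    ratings: dict instance_id -> list of (rater_id, label)
    reliability: dict rater_id -> final reliability estimate in [0, 1]
    The threshold rule is a hypothetical detection criterion.
    """
    flagged = []
    for instance_id, rated in ratings.items():
        if mean(reliability[rater] for rater, _ in rated) < threshold:
            flagged.append(instance_id)
    return flagged


def rerate(ratings, reliability, ask_rater, threshold=0.6, extra=2):
    """Collect up to `extra` new ratings per flagged instance.

    New ratings come from the most reliable raters who have not yet
    rated the instance. `ask_rater(rater_id, instance_id)` is a
    hypothetical callback that returns a label.
    """
    ranked = sorted(reliability, key=reliability.get, reverse=True)
    updated = {i: list(r) for i, r in ratings.items()}
    for instance_id in select_for_rerating(ratings, reliability, threshold):
        seen = {rater for rater, _ in ratings[instance_id]}
        for rater in [r for r in ranked if r not in seen][:extra]:
            updated[instance_id].append((rater, ask_rater(rater, instance_id)))
    return updated


def weighted_label(rated, reliability):
    """Pick the label with the highest reliability-weighted vote."""
    scores = defaultdict(float)
    for rater, label in rated:
        scores[label] += reliability[rater]
    return max(scores, key=scores.get)


# Toy data: x2 was rated only by raters later found to be unreliable.
ratings = {
    "x1": [("a", "pos"), ("b", "pos")],
    "x2": [("c", "neg"), ("d", "neg")],
}
reliability = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.4}

updated = rerate(ratings, reliability, ask_rater=lambda rater, inst: "pos")
final_label = weighted_label(updated["x2"], reliability)
```

In this toy run only `x2` is flagged, and the two additional ratings come from raters `a` and `b`. The trade-off the abstract mentions is visible here: each extra rating adds labeling cost, so `threshold` and `extra` control how much accuracy is bought at what price.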


Similar resources

Test-re-test reliability and inter-rater reliability of a digital pelvic inclinometer in young, healthy males and females.

Objective. The purpose of this study was to investigate the reliability of a digital pelvic inclinometer (DPI) for measuring sagittal plane pelvic tilt in 18 young, healthy males and females. Method. The inter-rater reliability and test-re-test reliabilities of the DPI for measuring pelvic tilt in standing on both the right and left sides of the pelvis were measured by two raters carrying out t...


A Study of Raters’ Behavior in Scoring L2 Speaking Performance: Using Rater Discussion as a Training Tool

The studies conducted so far on the effectiveness of resolution methods including the discussion method in resolving discrepancies in rating have yielded mixed results. What is left unnoticed in the literature is the potential of discussion to be used as a training tool rather than a resolution method. The present study addresses this research gap by exploring the data coming from rating behavi...


Towards a Task-Based Assessment of Professional Competencies

Performance assessment is exceedingly considered a key concept in teacher education programs worldwide. Accordingly, in Iran, a national assessment system was proposed by Farhangian University to assess the professional competencies of its ELT graduates. The concerns regarding the validity and authenticity of traditional measures of teachers' competencies have motivated us to devise a localized...


English and Non English major Teachers’ Assessment of Oral Proficiency: a case of Iranian Maritime English Learners

Speaking assessment is still construed as a complicated, under-researched process from the vantage point of tasks and rater characteristics. The present study aimed at investigating if and how English-major and non-English-major teachers differ in their perception of the construct of oral proficiency while assessing learners’ L2 oral proficiency. To this end, 38 male and female non-native EFL...


Improving the velocity tracking of cruise control system by using adaptive methods

Accurate controller performance is important in cruise control systems. Hence, in such systems, the controller should optimize itself against noise and probable changes in system dynamics. In this article, three approaches are applied toward this purpose: MIT, direct estimation and indirect estimation. These approaches are used as controllers to track refe...



Publication date: 2013
